Add ZSTD_HASH_USE_CRC32C to use an optimized hash#4611
Add ZSTD_HASH_USE_CRC32C to use an optimized hash#4611danlark1 wants to merge 4 commits intofacebook:devfrom
Conversation
At Google we found 3-5% improvement for densely compressed data when using CRC32C and low bits of its hash. Most uplift should be seen with at least `-msse4.2` flag for x86-64 and march=armv8-a+crc extension (from Arm8.3 CRC is mandatory). In this patch we added a macro, fixed implementation details which relied on shifts and added CI tests with different configurations. We also added a crc32c regression test to compare sizes and verify it's not regressing the values too much.
|
Everything looks reasonable to me! I'm going to spend some time testing this internally, and will get back to you. To fix the determinism test, you just need to add zstd/lib/common/portability_macros.h Lines 171 to 188 in d7ee320 That will opt the build out of the determinism test. |
|
Hey @terrelln, were you able to test in your environment? To be clear, we observed performance improvements on quite extensive real data benchmarks on production servers but it's not a guarantee before we roll out to prod. On laptops like Mac M3-M4 I was not able to see an improvement. Absolutely fine if you don't see improvements, that would give us more understanding if this patch is better or not. If you don't see any uplift, we will probably roll out to prod for some time and get back to you with harder data. We are probably afraid of safety and your feedback if this patch is safe is super valuable. |
At Google we found 3-5% improvement for densely
compressed data when using CRC32C and low bits
of its hash.
Most uplift should be seen with at least
-msse4.2flagfor x86-64 and march=armv8-a+crc extension
(from Arm8.3 CRC is mandatory).
In this patch we added a macro, fixed implementation
details which relied on shifts and added CI tests with
different configurations. We also added a crc32c
regression test to compare sizes and verify it's not
regressing the values too much.